Assessing the quality of textual features in social media

Authors

  • Flavio Figueiredo
  • Henrique Pinto
  • Fabiano Muniz Belém
  • Jussara M. Almeida
  • Marcos André Gonçalves
  • David Fernandes de Oliveira
  • Edleno Silva de Moura
Abstract

Social media is increasingly becoming a significant fraction of the content retrieved daily by Web users. However, the potential lack of quality of user-generated content poses a challenge to information retrieval services, which rely mostly on textual features generated by users (particularly tags) commonly associated with the multimedia objects. This paper presents what, to the best of our knowledge, is currently the most comprehensive study of the relative quality of textual features in social media. We analyze four different features, namely, TITLE, TAGS, DESCRIPTION and COMMENTS posted by users, in four popular applications, namely, YouTube, Yahoo! Video, LastFM and CiteULike. Our study is based on an extensive characterization of data crawled from the four applications with respect to usage, amount and semantics of content, descriptive and discriminative power as well as content and information diversity across features. It also includes a series of object classification and tag recommendation experiments as case studies of two important information retrieval tasks, aiming at analyzing how these tasks are affected by the quality of the textual features. Classification and recommendation effectiveness is analyzed in light of our characterization results. Our findings provide valuable insights for future research and design of Web 2.0 applications and services. © 2012 Elsevier Ltd. All rights reserved.


Similar resources

SMILE: An Informality Classification Tool for Helping to Assess Quality and Credibility in Web 2.0 Texts

The data made available by Web 2.0 applications such as social networks, on-line chats or blogs have given access to multiple sources of information. Due to this dramatic increase in available information, the perception of quality and credibility plays an important role in social media, thus making it necessary to discard low quality and uninteresting content. Moreover, the informal features of W...


An integrated Assessment System of Citizen Reaction towards Local Government Social Media Accounts

A government should use social media for communicating with its citizens. The engagement index score is one of the methods for assessing the rate of governmental success in using social media as a tool in establishing interactive relationships with its citizens. In general, the engagement index score is obtained by calculating the number of posts, number of likes and comments, and so forth on a single social...


The effect of social media quality and social presence on intention towards social commerce with the emphasis on educational services

Today, social media, as a channel for offering educational services, has become an extensive and effective educational tool for students. This survey aimed to investigate the factors affecting intention towards social commerce of educational services among students. The statistical population included social media users at Isfahan University of Medical Sciences in Iran. 214 students were s...


Finding Thoughtful Comments from Social Media

Online user comments contain valuable user opinions. Comments vary greatly in quality, and detecting high-quality comments is a subtask of opinion mining and summarization research. Finding attentive comments that provide some reasoning is highly valuable in understanding the user’s opinion, particularly in sociopolitical opinion mining, and aids policy makers, social organizations or government s...


A Micro- and Macro-Level Descriptive-Analytical Study of Translation Criticism in Iran: Are We Moving within a Framework?

The present corpus-driven study addresses the current situation of translation criticisms published in print or online in the Iranian media. A sample of 17 criticisms (roughly 68,000 words altogether) from a variety of valid media outlets was compiled. Having been categorized into those with, and those without an explicit theoretical framework, the criticisms were examined on two levels...



Journal:
  • Inf. Process. Manage.

Volume 49

Publication date: 2013